1,103 research outputs found

Optimal Rates for Random Fourier Features

Author: Sriperumbudur Bharath K.
Szabo Zoltan
Publication date: 01/01/2015

Kernel methods represent one of the most powerful tools in machine learning to tackle problems expressed in terms of function values and derivatives due to their capability to represent and model complex relations. While these methods show good versatility, they are computationally intensive and have poor scalability to large data as they require operations on Gram matrices. In order to mitigate this serious computational limitation, recently randomized constructions have been proposed in the literature, which allow the application of fast linear algorithms. Random Fourier features (RFF) are among the most popular and widely applied constructions: they provide an easily computable, low-dimensional feature representation for shift-invariant kernels. Despite the popularity of RFFs, very little is understood theoretically about their approximation quality. In this paper, we provide a detailed finite-sample theoretical analysis about the approximation quality of RFFs by (i) establishing optimal (in terms of the RFF dimension, and growing set size) performance guarantees in uniform norm, and (ii) presenting guarantees in $L^r$ ($1\le r<\infty$) norms. We also propose an RFF approximation to derivatives of a kernel with a theoretical study on its approximation quality. Comment: To appear at NIPS-201
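To make the phrase "easily computable, low-dimensional feature representation" concrete, here is a minimal sketch of the standard random Fourier feature construction for a Gaussian kernel. It is an illustration only and is not taken from the paper: the function names, the choice of Gaussian kernel, the bandwidth `sigma`, and the cosine-with-random-phase variant are all assumptions made for this example.

```python
import math
import random


def make_rff_map(dim, num_features, sigma=1.0, seed=0):
    """Return a feature map z(x) with z(x).z(y) ~ exp(-||x-y||^2 / (2 sigma^2)).

    Hypothetical helper for illustration; names and defaults are assumptions.
    """
    rng = random.Random(seed)
    # Frequencies drawn from the Gaussian kernel's spectral measure N(0, I / sigma^2).
    freqs = [[rng.gauss(0.0, 1.0 / sigma) for _ in range(dim)]
             for _ in range(num_features)]
    # Random phases, uniform on [0, 2*pi].
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(num_features)]
    scale = math.sqrt(2.0 / num_features)

    def feature_map(x):
        return [scale * math.cos(sum(w_i * x_i for w_i, x_i in zip(w, x)) + b)
                for w, b in zip(freqs, phases)]

    return feature_map


def gaussian_kernel(x, y, sigma=1.0):
    """Exact Gaussian kernel value, used only as a reference."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def rff_kernel_estimate(feature_map, x, y):
    """Approximate k(x, y) by the inner product of the two feature vectors."""
    return sum(a * b for a, b in zip(feature_map(x), feature_map(y)))
```

Comparing `rff_kernel_estimate(fmap, x, y)` with `gaussian_kernel(x, y)` for a growing `num_features` gives a rough empirical feel for the approximation error that the abstract says the paper bounds.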

arXiv.org e-Print Archive

CiteSeerX

UCL Discovery

Improving Sampling from Generative Autoencoders with Markov Chains

Author: Arulkumaran K
Bharath AA
Creswell A
Publication date: 31/12/2016

We focus on generative autoencoders, such as variational or adversarial autoencoders, which jointly learn a generative model alongside an inference model. We define generative autoencoders as autoencoders which are trained to softly enforce a prior on the latent distribution learned by the model. However, the model does not necessarily learn to match the prior. We formulate a Markov chain Monte Carlo (MCMC) sampling process, equivalent to iteratively encoding and decoding, which allows us to sample from the learned latent distribution. Using this we can improve the quality of samples drawn from the model, especially when the learned distribution is far from the prior. Using MCMC sampling, we also reveal previously unseen differences between generative autoencoders trained either with or without the denoising criterion
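The sampling process described as "iteratively encoding and decoding" can be sketched as a simple loop. This is only a structural illustration under stated assumptions: `encode` and `decode` are hypothetical callables standing in for the trained (and in practice stochastic) inference and generative networks, and `run_latent_chain` is a name invented for this example.

```python
def run_latent_chain(encode, decode, z0, num_steps):
    """Iterate z_{t+1} = encode(decode(z_t)) and return every visited latent.

    `encode` and `decode` are placeholders (assumptions) for a trained
    autoencoder's inference and generative models.
    """
    chain = [z0]
    z = z0
    for _ in range(num_steps):
        x = decode(z)   # generate a sample from the current latent
        z = encode(x)   # re-infer a latent from that sample
        chain.append(z)
    return chain
```

With a real model, `z0` would typically be drawn from the prior, and the decoded samples from later steps of the chain are the ones of interest.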

Spiral - Imperial College Digital Repository

Caching with Unknown Popularity Profiles in Small Cell Networks

Author: Bharath B. N.
Nagananda K. G.
Publication date: 26/07/2015

A heterogeneous network is considered where the base stations (BSs), small base stations (SBSs) and users are distributed according to independent Poisson point processes (PPPs). The SBS nodes are assumed to possess high storage capacity and to form a distributed caching network. Popular data files are stored in the local cache of an SBS, so that users can download the desired files from one of the SBSs in the vicinity, subject to availability. The offloading loss is captured via a cost function that depends on a random caching strategy proposed in this paper. The cost function depends on the popularity profile, which is, in general, unknown. In this work, the popularity profile is estimated at the BS using the available instantaneous demands from the users in a time interval $[0,\tau]$. This is then used to find an estimate of the cost function from which the optimal random caching strategy is devised. The main results of this work are the following: First, it is shown that the waiting time $\tau$ to achieve an $\epsilon>0$ difference between the achieved and optimal costs is finite, provided the user density is greater than a predefined threshold. In this case, $\tau$ is shown to scale as $N^2$, where $N$ is the support of the popularity profile. Secondly, a transfer learning-based approach is proposed to obtain an estimate of the popularity profile used to compute the empirical cost function. A condition is derived under which the proposed transfer learning-based approach performs better than the random caching strategy. Comment: 6 pages, Proceedings of IEEE Global Communications Conference, 201
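As a rough illustration of estimating a popularity profile from demands observed in a time window, the sketch below uses plain empirical request frequencies. The abstract does not state the paper's estimator, so this choice, the function name `estimate_popularity`, and the uniform fallback for an empty window are all assumptions.

```python
from collections import Counter


def estimate_popularity(requests, num_files):
    """Empirical popularity profile over files 0..num_files-1.

    `requests` is the list of file indices requested during the
    observation window. Hypothetical helper; not the paper's estimator.
    """
    total = len(requests)
    if total == 0:
        # No observations yet: fall back to a uniform profile (an assumption).
        return [1.0 / num_files] * num_files
    counts = Counter(requests)
    return [counts.get(i, 0) / total for i in range(num_files)]
```

For example, `estimate_popularity([0, 0, 1, 2], 3)` returns `[0.5, 0.25, 0.25]`; a longer observation window yields more requests and hence a less noisy estimate.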

arXiv.org e-Print Archive

Crossref

Asymptotics of a Clustering Criterion for Smooth Distributions

Author: Bharath Karthik
Dey Dipak K
Pozdnyakov Vladimir
Publication venue: 'Institute of Mathematical Statistics'
Publication date: 01/01/2013

We develop a clustering framework for observations from a population with a smooth probability distribution function and derive its asymptotic properties. A clustering criterion based on a linear combination of order statistics is proposed. The asymptotic behavior of the point at which the observations are split into two clusters is examined. The results obtained can then be utilized to construct an interval estimate of the point which splits the data and develop tests for bimodality and presence of clusters

arXiv.org e-Print Archive

Crossref

k-Nearest Neighbor Classification over Semantically Secure Encrypted Relational Data

Author: Elmehdwi Yousef
Jiang Wei
Samanthula Bharath K.
Publication date: 06/08/2014

Data mining has wide applications in areas such as banking, medicine, scientific research, and government. Classification is one of the most commonly used tasks in data mining applications. Over the past decade, due to the rise of various privacy issues, many theoretical and practical solutions to the classification problem have been proposed under different security models. However, with the recent popularity of cloud computing, users now have the opportunity to outsource their data, in encrypted form, as well as the data mining tasks to the cloud. Since the data on the cloud is in encrypted form, existing privacy-preserving classification techniques are not applicable. In this paper, we focus on solving the classification problem over encrypted data. In particular, we propose a secure k-NN classifier over encrypted data in the cloud. The proposed k-NN protocol protects the confidentiality of the data, the user's input query, and data access patterns. To the best of our knowledge, our work is the first to develop a secure k-NN classifier over encrypted data under the semi-honest model. Also, we empirically analyze the efficiency of our solution through various experiments. Comment: 29 pages, 2 figures, 3 tables. arXiv admin note: substantial text overlap with arXiv:1307.482

arXiv.org e-Print Archive

CiteSeerX

Montclair State University Digital Commons